Updated extrapolation docs for clarity and adding an image #15325
base: master
Conversation
develop-docs/application-architecture/dynamic-sampling/extrapolation.mdx
Before:
Dynamic sampling reduces the amount of data ingested, for reasons of both performance and cost. When configured, a fraction of the data is ingested according to the specified sample rate of a project: if you sample at 10% and initially have 1000 requests to your site in a given timeframe, you will only see 100 spans in Sentry. Without making up for the sample rate, any metrics derived from these spans will misrepresent the true volume of the application. When different parts of the application have different sample rates, there will even be a bias towards some of them, skewing the total volume towards parts with higher sample rates. This bias especially impacts numerical attributes like latency, reducing their accuracy. To account for this fact, Sentry uses extrapolation to smartly combine the data to account for sample rates.

After:
[Dynamic sampling](/application-architecture/dynamic-sampling) reduces the amount of data ingested, to help with both performance and cost. When configured, a fraction of the data is ingested according to the specified sample rates within a project. For example, if you sample 10% of 1000 requests to your site in a given timeframe, you will see 100 spans in Sentry.
Suggested change:
Client and Server side sampling reduces the amount of data ingested, to help with both performance and cost. When configured, a fraction of the data is ingested according to the specified sample rates within a project. For example, if you sample 10% of 1000 requests to your site in a given timeframe, you will see 100 spans in Sentry.
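A note for readers of this hunk: the 10%-of-1000 example above is the core of how extrapolation recovers totals. Here is a minimal sketch of the idea, assuming simple inverse-sample-rate weighting; the function name and structure are illustrative, not Sentry's actual implementation:

```python
# Minimal sketch of inverse-sample-rate count extrapolation.
# Illustrative only -- not Sentry's actual implementation.

def extrapolated_count(sample_rates: list[float]) -> float:
    """Estimate the true event count from the sample rates attached
    to the events that were actually ingested."""
    # Each ingested event stands in for 1 / sample_rate real events.
    return sum(1.0 / rate for rate in sample_rates)

# 100 spans ingested at a 10% sample rate extrapolate back to 1000.
print(extrapolated_count([0.1] * 100))  # 1000.0
```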
Unchanged:
- **Accuracy** refers to data being correct. For example, the measured number of spans corresponds to the actual number of spans that were executed. As sample rates decrease, accuracy also goes down because minor random decisions can influence the result in major ways.

Before:
- **Expressiveness** refers to data being able to express something about the state of the observed system. Expressiveness refers to the usefulness of the data for the user in a specific use case.

After:
- **Usefulness** refers to data being able to express something about the state of the observed system, and the value of the data for the user in a specific use case. For example, a metric that shows the P90 latency of your application is useful for understanding the performance of your application, but a metric that shows the P90 latency of different endpoints in your application sampled at 10%, 1%, and 5% is not as useful because it is not a complete picture.
| - **Usefulness** refers to data being able to express something about the state of the observed system, and the value of the data for the user in a specific use case. For example, a metric that shows the P90 latency of your application is useful for understanding the performance of your application, but a metric that shows the P90 latency of different endpoints in your application sampled at 10%, 1%, and 5% is not as useful because it is not a complete picture. | |
| - **Usefulness** refers to data being able to express something about the state of the observed system, and the value of the data for the user in a specific use case. For example, a metric that shows the P90 latency of your application is useful for understanding the performance of your application, but a metric that shows the P90 latency of different endpoints in your application sampled at 10%, 1%, and 5% may not be as useful because it cannot represent the complete picture. |
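The mixed-sample-rate P90 example in this hunk is easier to see with numbers. Below is a hypothetical sketch of a sample-rate-weighted percentile, where each span counts 1/sample_rate times; this is illustrative and not necessarily how Sentry's storage platform computes it:

```python
# Hypothetical sketch: a sample-rate-weighted percentile, so spans
# from heavily sampled endpoints don't dominate the estimate.

def weighted_percentile(values, sample_rates, q):
    """Percentile where each value is weighted by 1 / sample_rate."""
    pairs = sorted(zip(values, sample_rates))
    total = sum(1.0 / r for _, r in pairs)
    cumulative = 0.0
    for value, rate in pairs:
        cumulative += 1.0 / rate
        if cumulative >= q * total:
            return value
    return pairs[-1][0]

# Latencies from endpoints sampled at 10% and 1%: the 1% span stands
# in for 100 real spans, so it carries 10x the weight of a 10% span.
latencies = [120, 130, 900]
rates = [0.1, 0.1, 0.01]
print(weighted_percentile(latencies, rates, 0.9))  # 900
```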
Before:
Depending on the context and the use case, one mode may be more useful than the other. Generally, default mode is useful for all queries that aggregate on a dataset of sufficient volume. As absolute sample size decreases below a certain limit, default mode becomes less and less expressive. There are scenarios where the user needs to temporarily switch between modes, for example, to examine the aggregate numbers first and dive into the number of samples for investigation. In both modes, the user may investigate single samples to dig deeper into the details.

After:
Depending on the context and the use case, one mode may be better suited than the other. Generally, default mode is useful for all queries that aggregate on a dataset of sufficient volume. As absolute sample size decreases below a certain limit, default mode becomes less and less useful. There are scenarios where you may need to temporarily switch between modes, for example, to examine the aggregate numbers first and dive into the number of samples for investigation. In both modes, you may investigate single samples to dig deeper into the details.
Suggested change:
Depending on the context and the use case, one mode may be better suited than the other. Generally, default mode is useful for all queries that aggregate on a dataset of sufficient volume. As absolute sample size decreases below a certain limit, default mode becomes less and less useful. There are scenarios where you may need to temporarily switch between modes, for example, to track usage and identify which endpoints or operations consume most spans
Before:
- **Default mode** extrapolates the ingested data as outlined below.
- **Sample mode** does not extrapolate and presents exactly the data that was ingested.

After:
- **Default mode** extrapolates the ingested data as outlined below - targeting usefulness.
- **Sample mode** does not extrapolate and presents exactly the data that was ingested - targeting accuracy, especially for small datasets.
Let's please not have a Samples Mode concept - this is just turning off extrapolation - so we can call it "Unextrapolated Mode" ?
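Regarding the default/sample ("Unextrapolated") mode distinction discussed in this thread: the difference boils down to whether the inverse-sample-rate weight is applied during aggregation. A hypothetical sketch for a count aggregate, not Sentry's actual query engine:

```python
# Hypothetical sketch of the two query modes: default mode weights
# each event by 1 / sample_rate; sample mode aggregates raw events.

def count(events: list[dict], extrapolate: bool = True) -> float:
    if extrapolate:  # default mode
        return sum(1.0 / e["sample_rate"] for e in events)
    return float(len(events))  # sample ("unextrapolated") mode

events = [{"sample_rate": 0.1}] * 100
print(count(events))                     # 1000.0 -- default mode
print(count(events, extrapolate=False))  # 100.0  -- sample mode
```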
Before:
1. **When both sample rate and event volume are low**: Extrapolation becomes less reliable in these cases. You can either increase your sample rate to improve accuracy, or switch to sample mode to examine the actual events - both are valid approaches depending on the user's needs.

After:
1. **When both sample rate and event volume are low**: Extrapolation becomes less reliable in these cases. You can either increase your sample rate to improve accuracy, or switch to sample mode to examine the actual events - both are valid approaches depending on your needs.
2. **When you have a high sample rate but still see low event volumes**: In this case, increasing the sample rate won't help capture more data, and sample mode will give you a clearer picture of the events you do have.
Would suggest rewriting this part - we never want ppl to disable extrapolation for anything other than debugging usage -
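To back up the reviewer's point that extrapolation should stay on outside of debugging: at low volume and low sample rate, each ingested event moves the extrapolated estimate by a full 1/sample_rate, so small random fluctuations dominate. A quick numeric illustration with assumed numbers:

```python
# Why low volume + low sample rate is unreliable: one event more or
# fewer shifts the extrapolated count by a full 1 / sample_rate.

rate = 0.01    # 1% sample rate (assumed for illustration)
ingested = 3   # only three events survived sampling

estimate = ingested / rate  # 300.0
swing = 1 / rate            # 100.0 per single sampled event
print(estimate, estimate - swing)  # 300.0 200.0 -- a one-event jump
```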
Unchanged:
### Confidence

Before:
When users filter on data that has a very low count but also a low sample rate, yielding a highly extrapolated but low-sample dataset, developers and users should be careful with the conclusions they draw from the data. The storage platform provides confidence intervals along with the extrapolated estimates for the different aggregation types to indicate when there is elevated uncertainty in the data. These types of datasets are inherently noisy and may contain misleading information. When this is discovered, the user should either be very careful with the conclusions they draw from the aggregate data or switch to non-default mode for investigation of the individual samples.

After:
When you filter on data that has a very low count but also a low sample rate, yielding a highly extrapolated but low-sample dataset, you should be careful with the conclusions you draw from the data. The storage platform provides confidence intervals along with the extrapolated estimates for the different aggregation types to indicate when there is lower confidence in the data. These types of datasets are inherently noisy and may contain misleading information. When this is discovered, you should either be very careful with the conclusions you draw from the aggregate data or switch to sample mode to investigate the individual samples.
We should include this image here: https://docs.sentry.io/product/explore/trace-explorer/#sampling-warnings
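On the confidence intervals mentioned above (and the sampling-warnings image the reviewer links to): one textbook way to derive such an interval for an extrapolated count, assuming independent Bernoulli sampling at a known rate, is a normal approximation. This is a hedged sketch, not Sentry's actual formula:

```python
# Rough confidence interval for an extrapolated count under
# Bernoulli(rate) sampling, via a normal approximation.
# A textbook sketch -- not Sentry's actual formula.
import math

def count_confidence_interval(n_ingested: int, rate: float, z: float = 1.96):
    estimate = n_ingested / rate
    # Plug-in variance of the inverse-rate (Horvitz-Thompson) count
    # estimator: n * (1 - rate) / rate**2.
    stddev = math.sqrt(n_ingested * (1 - rate)) / rate
    return estimate - z * stddev, estimate + z * stddev

# 5 events at a 1% rate: the point estimate (500) is highly uncertain.
print(count_confidence_interval(5, 0.01))  # roughly (64, 936)
```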
DESCRIBE YOUR PR
Updated the extrapolation docs for clarity and added an image to make it clearer how extrapolation works.
IS YOUR CHANGE URGENT?
Help us prioritize incoming PRs by letting us know when the change needs to go live.
SLA
Thanks in advance for your help!
PRE-MERGE CHECKLIST
Make sure you've checked the following before merging your changes: